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Abstract 

In this paper, channel estimation and data detection for multihop relaying orthogonal frequency 
division multiplexing (OFDM) system is investigated under time-varying channel. Different from pre- 
vious works, which highly depend on the statistical information of the doubly-selective channel (DSC) 
and noise to deliver accurate channel estimation and data detection results, we focus on more practical 
scenarios with unknown channel orders and Doppler frequencies. Firstly, we integrate the multilink, 
multihop channel matrices into one composite channel matrix. Then, we formulate the unknown channel 
using generalized complex exponential basis expansion model (GCE-BEM) with a large oversampling 
factor to introduce channel sparsity on delay-Doppler domain. To enable the identification of nonzero 
entries, sparsity enhancing Gaussian distributions with Gamma hyperpriors are adopted. An iterative 
algorithm is developed under variational inference (VI) framework. The proposed algorithm iteratively 
estimate the channel, recover the unknown data using Viterbi algorithm and learn the channel and noise 
statistical information, using only limited number of pilot subcarrier in one OFDM symbol. Simulation 
results show that, without any statistical information, the performance of the proposed algorithm is very 
close to that of the optimal channel estimation and data detection algorithm, which requires specific 
information on system structure, channel tap positions, channel lengths, Doppler shifts as well as noise 
powers. 

Rui Min and Yik-Chung Wu are with the Department of Electrical and Electronic Engineering, The University of Hong Kong, 
Pokfulam Road, Hong Kong. Email:{minrui,ycwu}@eee.hku.hk. 
"The corresponding author is Yik-Chung Wu. 



March 4, 2013 



DRAFT 



2 



Index Terms 

Doubly-selective channel, Channel estimation, Data detection, Variational inference, Orthogonal 
frequency division multiplexing, Multihop relaying system 

I. Introduction 

Next generation broadband system aims to support higher levels of mobility, connectivity 
and efficiency. Multihop relaying system is a perfect suit for such requirement due to their 
benefits in easy deployment, enhanced connectivity, flexible adaptability, and increased capacity. 
On the other hand, orthogonal frequency division multiplexing (OFDM) has been adopted as 
the transmission scheme for many next generation broadband standards, such as WiMAX, 
LTE and IEEE 802.16. These result in the need to develop receiver algorithms for multihop 
OFDM system under high mobility. With high mobility, the broadband wireless channel is both 
frequency-selective and time-varying, a.k.a. doubly-selective. The channel responses vary sample 
by sample, which destroy the orthogonal property among subcarriers and causes intercarrier 
interference (ICI). Besides, the relaying system structure and channel statistical information are 
generally unknown to the receiver, due to flexible configuration of relaying paths. These poses 
strong challenges to channel estimation and data detection of OFDM relaying system under high 
mobility. 

Over doubly- selective channel (DSC), channel estimation and data detection for single-hop 
OFDM systems has been considered in |[l]-||6j, wnere me two tasks are treated separately. In 
JTJ-Q, the frequency domain channel matrix is approximated with a diagonal matrix under the 
assumption of small normalized Doppler frequency. The resulting algorithm would produce poor 
channel estimate and subsequently degrade the data detection performance for fast time-varying 
DSC. In view of that, [5] and [6] assumed a banded frequency-domain channel matrix, thus 
achieve a better channel modeling accuracy. However, due to the ICI introduced in frequency 
domain, pilots and data would interfere each other. This leads to the interdependence between 
channel estimation and data detection, and joint processing of them is necessary. 

Research on multihop channel estimation is still limited especially for DSC, due to the 
complicated system structure and uncertain time-varying channel property. Among the limited 
existing works, [7] studied the two-way relaying (TWR) system under frequency-nonselective 
time-varying channel, and complex exponential basis expansion model (CE-BEM) was used to 
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reduce the number of channel parameters. However, [|7J only considered single carrier system 
and extension to multicarrier system is not straightforward due to additional ICI. More recently, 
in [[8j, an iterative algorithm for data detection and channel estimation was proposed for dual- 
hop amplify-and-forward (AF) OFDM system. With complete information of channel and noise 
in each hop, the data detection results were very close to the ideal case. Unfortunately, this 
work did not consider multihop relaying system, which has a more complicated structure. More 
importantly, it requires the destination receiver to have full statistical information of all channels 
and noise powers, which might not be readily available in practice. 

In this paper, we study the channel estimation and data detection of OFDM-based multihop 
AF relaying system under high mobility, with special focus on unknown channel orders and 
Doppler frequencies. Based on the fact that the combined channel information is sufficient 
for data detection, and in order to reduce computation load on relays and time delay of the 
whole system, no channel estimation is performed at the relays. Different from previous works 
which highly rely on information of system structure, channel tap positions, channel lengths 
and Doppler frequencies of all channels, as well as noise powers at all receivers, we propose 
to solve the problems with none of the above information. By first expanding the composite 
source-relay-destination channel using generalized complex exponential basis expansion model 
(GCE-BEM) with a large oversampling factor, we introduce channel sparsity on delay-Doppler 
domain. Then sparsity enhancing Gaussian distributions with Gamma hyperpriors are adopted 
for channel estimation to enable the identification of nonzero elements. An iterative algorithm 
is proposed based on variational inference (VI) framework to iteratively estimate the channel, 
recover the unknown data using Viterbi algorithm and learn the channel and noise statistical 
information, using only limited number of pilot subcarrier in one OFDM symbol. Simulation 
results show that the performance of the proposed algorithm is very close to that of an optimal 
algorithm, which requires detailed statistical information on channels and noises. 

The rest of the paper is organized as follows. The OFDM-based multihop relaying system is 
introduced in Section |n} Then in Section III, the channel matrices of all the hops are integrated 



into one concise composite channel matrix and reformulated using GCE-BEM with a large 
oversampling factor. The iterative channel estimation and data detection algorithm is developed 



under VI framework in Section IV And in Section M least squares (LS) channel estimator 



and equalizer are derived to obtain the initial parameters for the proposed iterative algorithm. 

March 4, 2013 DRAFT 



4 



Simulation results of the proposed algorithm are presented in Section [VTJ Finally, this paper is 
concluded in Section IVII1 

Notations: Boldface uppercase and lowercase letters will be used for matrices and vectors, 
respectively. Superscripts H, T and * denote Hermitian, transpose and conjugate, respectively. 
The symbol Ijv represents the Nx N identity matrix. Symbol e; denotes the vector with structure 
given as [Oi x j, 1, Oix(N-i-i)] T , where 0i x ; is the I dimension all-zero row vector. diag{x} stands 
for the diagonal matrix with vector x on its diagonal. The notation [X] mi . m2 ni:n2 represents the 
submatrix of X consists of entries on the mi-to-mf 1 rows and ni-to-nf 1 columns. E{-} denotes 
the expectation while Tr{X} and det{X} are the trace and the determinant of the square matrix 
X. Re{-} denotes the real part. And \x] rounds x to the nearest integer greater than or equal to x. 
Finally, F represents the discrete Fourier transform (DFT) matrix with [F] m n = -±= e -j 2 * mn / N _ 

II. System Model 

In this paper, we consider a multihop relaying system consists of a source S, a destination D 
and a number of relays scattered in the middle. Each of them is equipped with single antenna. 
Without loss of generality, we assume the relays work cooperatively to form K links, each of 
them consisting of T + 1 hops. Apart from the K relaying paths, there is no other link between 
§ and D, and all relays employ the AF scheme. Denoting the relay on the k th link connecting 
the p th and the (p + l) th hop as Mfc iP , the relaying system is shown in Figure [I] 

The channel of each hop is assumed to be doubly- selective channel (DSC). Specifically, at the 
p th hop of the k th relaying path, the channel consists of N k:P independent nonzero channel taps 
with maximum delay of (L^ — 1)T S , where T s is the sample interval. We consider the general 
situation that the channel taps are not necessarily consecutive, so that we have N kyP < Z^ x . 
Let hk, P {n,l) be the I th tap of that channel at time nT s . For a given p and k, the channel taps 
are independent and each one being a zero-mean complex Gaussian process with bandlimited 
power spectral density within [—fk, P (l),fk, P (l)}, where fk, p (l) is the maximum Doppler shift of 
the I th tap. In general, fk, P (l) may be distinct for different /, since each tap results from signal 
transmission through a different physical scattering. Furthermore, it is assumed that the channels 
for different links k and hops p are independent from each other. 
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A. OFDM Signal Transmitted from § 

In an OFDM system, the frequency domain source data x = [x(0), . . . ,x(N — 1)] T is first 
transformed to the time domain data s = F^x, where F represents the discrete Fourier transform 
(DFT) matrix. In order to facilitate channel estimation and data detection, pilots are inserted in 
the frequency domain as 

(xJn) V nGl 
x d {n) V n G 3d, 

where 3d is the index set of the Nd unknown data symbols, 3 P is the index set of the N p pilot 
symbols and we have N = Nd + N p . In matrix form, x can be represented as 

x = E d x d + EpXp, (2) 

where E d and E p , with dimensions N x N d and iV x N p , respectively, map x d and x p to their 
corresponding subcarriers. Before transmission, a Cyclic Prefix (CP) of length L cp is added at the 
beginning of s to prevent intersymbol interference (ISI). Since the OFDM signal goes through 
a number of relays before reaching the destination, L cp should be larger than the maximum 
channel length among all the relaying paths, denoted as L max = ma.x(J2^=i -^max ~~ ^0- 

B. Received OFDM Signal 

In AF relaying system, each relay merely amplify the received signal before passing the signal 

to the next relay or destination. For the k th relaying path, the signal received at IR^i is given by 

T k,i _, 

rk,i(n) = ^ h ki i(n,l)s(n-l)+w k>1 (n), (3) 

1=0 

where Wk,i(n) is the additive white Gaussian noise (AWGN) with power w\ x . Upon receiption, 
the relay amplifies the incoming signal as [|9) 

Zk,x{n) = ?fc,ir fcil (n) (4) 

and then transmits Zk,x(n) to the next relay ]R fcj2 through channel h k ^(n,l). The received signal 
is then represented as 

J kj2 i 

r k ,2(n) = ^2 h,2{n,l)z k;1 (n-l) +w ki2 {n), (5) 
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with Wk,2{ n ) being the AWGN with power w\ 2 . The signal is then amplified as z k ^{n) = 
Sk,2i"k,2{n) and transmitted to Rfc i3 and so on. Then, at the T th relay, the amplified signal is 
transmitted to the destination. Finally at destination D, the received signal is given by 

K T k ' r + 1 1 

y( n ) = S ^2 h k ,T+i(n,l)zk,r(n - 1) +w d (ri), (6) 

fc=l (=0 

where AWGN Wd(n) has power ro^. Upon reception, the CP [y(— L cp ), . . . ,y(— 1)] T is removed 
and the received signal y = [y(0), . . . , y(JV — 1)]^ can be written in matrix form as 

K 

y = ^2 Hfc,T+lZfc,T + Wd, (7) 

fe=i 

where z fciT = [z kt r(-(L^+ 1 - 1)), ... , z ktT (0), . . . , 2 fc)T (JV - 1)] T , w d is the noise vector with 
elements Wd{n), and H fc>T +i is an JV x (JV + £mJ x +1 — 1) channel matrix given by 

^,r + i(0,L^ +1 -l) ... Vr+i(0,0) 



H 



frfc,T+l(l) £ 



fc,T+l 



r 



fc,T+l 



1,0) 



Furthermore, z fc>T can be written in terms of z fcjT -i as 



^ )T+ i(iV-l,0) 



(8) 



(9) 



with z fcjT _! = [^,t-i(-(^L +1 + + 2), . . . , z fc>T -i(0), • • • , z kt r-i(N - 1)] T , w,, x is the 

corresponding noise vector, and H fc)T is an (JV + i^J x +1 — 1) x (JV + L^J^ 1 + L^ x — 2) matrix 
given by 

" h k! r(l - L*& - 1) ... h k , r (l - 0) 

h ktT (2 - iW+i, l*£ - 1) ... ^,r(2 - 0) 



H 



fc,T 



(10) 



^, T (JV-1,L^-1) ••• hAN-1,0) 
Tracing back to the I s * hop, we have z k) i = SjfcjiH^iSfc + g^iw^i, where H^i is an (iV + 
Eji 1 L mL - T) x (JV + Ej=i X L max - T - 1) channel matrix with structure the same as M 
and <flOJ>, and s fc = E fc s with E fc = [[Ijv]^ (n-J2 t+ 1 l^+t+2)-n> ^ n ~\ T characterizing the effect 
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of the CP. Based on the above derivations, the received signal vector y is 



K 

E 

k=l 
v 



K 



k=l 
\ 



T 



[ J Sk,p J (Hfc_T+l • • • Hk,l) 
vp=l 



F"x 



— 



T / / T 

( ( II ft >« ) (Sfe,T+i • • • H fcjP+1 ) ] w fc „ 

.p=l V \£>=p 



+ w d , 



(ID 



where H represents the composite channel matrix and v represents the composite noise effect. 



III. Reformulation of the Composite Channel Matrix 



In order to estimate the channel and detect the data, it is important to investigate the structure of 
the channel matrix H. Writing H = J2k=i (llj=i ft,p) H fc> where H fc = H fcjT +iH fci T . . . H M E fc . 
To find out the structure of H fc T+1 H fc T • • • H fc t iE fc , we start from H fc T +i an d H fe T with their 
expressions given in ([8]) and (10), respectively. Each matrix represents the linear convolution 
of a time- varying channel and the matrix multiplication expresses the convolution effect of two 
time-varying channels. Therefore the resulting matrix H fc T+1 H fc T will also be in the form of 
([8]) and (10), except that the resulting channel length of the new time-varying channel is now 
being L 



fc,T+l i rfc,T _ -I 
max ' max 



Similarly, multiplying H^-f-i to H fc T+1 H fc T from the right, the result H fc T +iH fe T H fc T -i 
will be an A^x (N+J2 p= y-i — 3) matrix, with equivalent channel length of J2 P =y-i ^max~ 2, due 
to the convolution effect. Continuing the matrix multiplication, we have H fc T +i • • • Hfc ;1 being an 
N x (N + Y^=i L^ p ax - T - 1) matrix, with equivalent channel length of ^max - T - And 

eventually E fc moves the J2 p =i -^max — T — 1 columns from the left part of H fc>T +i • • • Hfc,i to the 
upper right corner. The resulted composite channel matrix H fc is an N x N circular convolution 
matrix of a time- varying channel with equivalent channel length of Yl^=i -^max — Thus H, as 
the weighted sum of H^'s, has the same circular convolution matrix structure of a time-varying 
channel with length L max = max(J2l=} ^max - T ) : 



H 



£(o,o) o 
MM) £(i,o) 





MO, Anax - 1) 



fl(N-l,L max -l) 



/i(l,L max - 1) 



£(0,1) 



£(iV-l,0) 



(12) 
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or equivalently 

^max 1 

H= diag{A ; }P(0, (13) 

1=0 

where \x x = [/i(0, /),..., p,(N — 1, l)) T consists of all the composite channel coefficients of the 
I th tap and P(/) = [ej, . . . , ejv-i, e , . . . , ej_i]. Thus ( 11 ) becomes 

y= Yl diag{^}P(/)F H x + v. (14) 

1=0 

It should be noticed that, the receiver knows neither the individual channel information of each 
hop nor the statistical information about the composite channel. This is a natural assumption, 
as the channels are time-varying and depend on the speed of transceivers and the environment 
around them. In particular, the receiver has no knowledge on the composite channel tap positions 
(if the tap positions of individual channel are not consecutive) and the maximum Doppler shift 

/max = max (X]I=L 1 max fkp{l))- Furthermore, the noise power at each relay M.k p is not 

k Je[o,i&£J 

available to the receiver either. As a result, the receiver has no information on the composite 
noise power. 

In order to proceed, we propose to calculate an upper bound on the maximum Doppler shift and 
the delay for the composite channel. Let t> max be the maximum relative velocity between two units 
in any hop in the relaying system. Since v max f c /c > fk, P {l) f° r all k, p and I, where f c and c are 
the carrier frequency and the speed of light, respectively, we have / max < fu = (T+ {)v max f c /c. 
And in the delay domain, the best the receiver knows is that L cp is chosen large enough to avoid 
ISI. Thus L max < L cp and all the nonzero taps fall in the range of {0, . . . ,L cp — 1}. With 
the ranges of the delay-Doppler domain defined for the composite channel, we can expand the 
channel with generalized complex exponential basis expansion model (GCE-BEM) as follows 

Q 

p(n, 0=2 V q (l> j27rqn/VN , l = 0,...,L cp -l, n = 0, . . . , N - 1, (15) 

q=-Q 

where Q = \VNfuT s ~\ and V is the oversampling factor, and (JL q (i) is the GCE-BEM coefficient. 
It should be noticed that p q {l) = in two conditions: 1) ft(n,l) = 0; 2) |g| > VNf max T s . 



From ( 15 ), the vector can be expressed as = Ylq=-o ¥ 5 (?)/ i g(0' where cp(q) = [1, e j27Tq ^ VN 



e j2wq(N-i)/VN^T denotes the q th basis vector. Putting this result into (14), taking the DFT 
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on the signal y and replacing the unknown L max with L cp , we have 

Lcp-l Q 

y = Fy = Yl Fdia 8{ E V(9)^ ff (0}P(0F H x + v 

z=o <?=-<3 

= ^ ^[Fdiag{^(g)}P(/)F^x]^(/)+v, (16) 

Z=0 q=-Q 

where v = Fv represents the noise vector after DFT. Let fi = [/x 9 (0), . . . , fi q (L cp — 1)] T , then 



(16) can be written as 



Q 

y=J2 [Fdiag{^(g)}P(0)F^x, . . . , Fdiag{cp(g)}P(L cp - l)F H x]/x, + v. (17) 

Further define [i = [n T Q , ■ ■ ■ , A*q] t and let G[x] = [G_q[x], . . . , Gq[x]], thus we have 

y = G[x]^ + v. (18) 



On the other hand, from (11 ), let D[/x] = FHF^, the system model can also be written as 

y = D[/^]x + v. (19) 

It is clear that D[/x]x = G[x]/Lt. 

IV. Iterative Channel Estimation and Data Detection 



From the system model ( 18 ) and ( 19 ), the problem is to jointly estimate the composite channel 
BEM coefficients and the unknown data x^, without the knowledge of the composite noise 
variance, denoted by w\. Since we have expanded the composite channel over an extended 
range in the delay-Doppler plane, we also want to make use of the prior information that most 



of the BEM coefficients will be zero (i.e., is sparse). It is noticed from (18) and (19) that, 
estimation of channel requires knowledge of data and vice versa, thus leads to challenges in 
joint channel estimation and data detection. In this paper, a variational framework is adopted 
to iteratively improve the channel estimation and data detection results. Compared with other 
iterative frameworks, e.g., expectation-maximization (EM), VI is more general as it works within 
a complete Bayesian paradigm and gives a posterior distribution over all the parameters. Below, 
we first assign prior distributions to the unknown parameters. 
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A. Prior Distributions of the Unknown Parameters 

First, the prior distribution of /j, is assumed to be Gaussian 



= Tr^det(A-i) ex P^ A ^' ( 2 °) 

T 



where M = (2Q + l)L cp , A = diag{o;} and a = [aci, . . . , «m] is a vector containing the 



inverse variance of the elements of n. Then a hyperprior for a is specified as |10| 

p(oij) = Ga(otj\aj,bj) = 'at* 1 1 exp(— 6jQj)/r(aj), (21) 



with parameters a,-, Although the Gaussian prior given by (20) does not have strong prob- 



ability peaks for sparsity promotion, by working with (21), the marginal prior p(fx) obtained 
by integrating out a is a t— distribution, which nicely approximates a Laplace distribution p0| . 
Laplace prior is widely adopted in Ll-norm regularization schemes like Basis Pursuit (BP) 
pT| . Unfortunately, using the Laplace prior directly does not lead to a tractable variational 
treatment JT0| |. As a result, BP is usually used in one-shot sparse channel estimation [12|-[14|. 



Furthermore, BP or BP denoising methods rely on the noise power information [15|, which 
is not known in our case. Thus, the above hierarchical prior structure, which is both sparsity 
promoting and analytically tractable, is a suitable alternative for our problem. 

For x d , since we do not have knowledge on its value before observing the received signal, 
we set equal preference to all constellation points. Furthermore, due to the independent property 
among data elements, we have 

i tt r i 

Pfoi) = T^Vd II [ zJ 6 ( x d( n ) - Zd(n))\ , (22) 

d n=l x d (n)eC d 

where is the constellation points of the modulation and M. d is the modulation order, e.g., 
M d = 4 for QPSK. 

Besides, the unknown noise power is assumed to obey a Gamma prior, such that it can be 
learned under the variational framework. For ease of expression, let (3 = and then 

p(/3) = G&(P\c, d) = d c p c ^ exp(-d/3)/r(c), (23) 

where c, d are the parameters of the Gamma distribution. In the absence of prior information, 
small values for hyperparameters are chosen, i.e., aj = bj = c = d = 1CT 6 , so as to produce 



uninformative priors for the channel and noise power [10|. 
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B. Variational Inference 

With the introduced prior and hyperprior distributions, our aim is to jointly estimate fi,ct,(3 
and x d . In Bayesian framework, this corresponds to maximizing the posterior probability density 
function (pdf) ct, /3, x rf |y). However, this pdf is in general very hard to be obtained in closed- 
form and the maximization of it is inconvenient. In the VI framework, a Q(n, ct, f3, x d ) function, 
which is in tractable form but closely represents p(fjt, ct, (3, x d |y), is adopted to efficiently derive 
the estimation algorithm. The optimal Q(fi,ct, /3, x<j) function can be obtained by minimizing 
the free energy function defined as p6|: 



F= / Q{», ct, /3, x d ) log ^ "'^ * d \ d»dctd{3dx d . (24) 

Notice that, p(fi, ct, (3, x d , y) is used instead of p(fi, ct, (3, x d \y) because they are proportional 
and thus equivalent in free energy formulation. According to the mean-field approximation JT7J, 
Q(n, ct, (3, x d ) can be factorized into a product form, i.e., Q(n, ct, [3, x d ) = Q(fi)Q(ct)Q(/3)Q(-x d ) 
This is equivalent to assuming that n,ct,/3 and are independent conditioned on y and will 
greatly simplify the iterative algorithm. With the mean-field approximation, the variational free 
energy in ([24]) is given by 



¥ = I Q(fi, a, p, x d ) log -j— — Q^o^Xd — dndctd(3dx d 

>n, a ,p,x. d P(y|A*> P> Xd)p{v\ct)p{a)p(l3)p{x d ) 



Q(lJ,)logQ(iJ,)dfj,+ / Q(ct)\ogQ(ct)dct+ / Q(P)logQ(P)dP 

Ja J p 

+ / Q(x d )logQ(x d )dx d - / Q(fj,)Q(ct) log p(fj,\ct)dfidct - / Q (a) log p(ct) da 

J ~x.il J \i,a J a 

Qifi) logp((3)d/3 - \ Q(x d ) log p{x d )dx d 

Q(/j,)Q(/3)Q(x d ) logp(y|/Lt, /3, x d )dfxd(3dx d . (25) 



In order to calculate the free energy function given in (25 ), the likelihood function p(y\fi, (3, x d ) 
and the form of Q functions are needed, in addition to ( [20] ), pT) , ([22]) and ( ]23| ). Since the noise 
is assumed to be AWGN, the likelihood function is given by 

p(y\»,P,x d )=0^J exp{-/3(y-G[x] M Hy-G[x] M )}. (26) 

For (5(h), Q(ct), Q((3) and Q(x d ), they represent the approximate posterior distributions for 
the respective parameters, and should be chosen in a way that facilitate the manipulation. In 
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particular, in order to maintain the sparsity enhancing property in the approximate posterior 



distribution of the channel BEM coefficients, we choose QT0| ] 

1 



7r M det(S A 



exp{-(/u - m M ) ff S At 1 (^ - m^} 



Q(ctj) = Ga(aj\aj,bj) = b°- 3 en" 3 1 exp(—bjaj)/T(aj) 



(27) 
(28) 



with m M , S M a, and bj being unknown parameters. Furthermore, for composite noise power, we 
set Q((3) as 

Q{p) = Ga{(3\5, d) = d c f~ x exp(-J / 5)/r(c) (29) 
with c and d being unknown parameters. And for x d , in view of its discrete property, a close 

Q(x d ) = 5(x d - x d ), 



approximation is given as [18| 



(30) 



with x d being a parameter of Q(x d ). 



With all the distribution functions given above, the nine terms in (25) can be computed 
respectively. The detailed calculations are shown in the Appendix |A| With the obtained results 



( |49| ), ( |50| ), ( gTj ), ( |52| ), ( |54] >, ( |55) , ( |56| ), ( |57| ), ( pH) , and after eliminating some constant terms, the 
closed-form expression of the free energy function can be written as 



¥(m IJ ,,'E ll ,a j ,bj,c, d, 5c d ) 



<< a M -i f _~ _~ H , ^ 



-logdet^) + Tr diag{ . . . , p-]}(m M m 

1 01 »Af 

M 

+ X logfy + (a, - l)[^(aj) - log&j] - - logr(a^ 

3=1 
A/ 

- ^ [oj logfy + (aj - l)[*(5j) - logfy] - fya^/fy - logT( 

3=1 
M 

+ clog J + (c - l)[*(c) - logrf] - c 



3=1 



- log r(c) - (c - 1) [*(c) - log d] + dc/d - N[ty{c) - log d] 

+i [Tr{G^[x]G[x](A>f + S,)} + y^y - 2Re {y^G[x] A,} 



+X>{ £ '( 



x d in) - xAn 



n=l 



x d (n)£C d 



))}, 



(3D 
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where ^(a) = ^logT(a) is the digamma function, and we let x = E p x p + E d x rf to simplify 
the expression. Notice that, the free energy function only depends on m M , S M , dj, bj, c, d and 



C. Iterative Minimization of Free Energy Function 



After obtaining the closed-form free energy function (31 ), the next step is to minimize the free 
energy function in order to obtain the optimal m M , S^, dj, bj, c, d and St d . As the function depends 
on a large number of parameters, it is difficult to obtain the optimal parameters analytically in 
one step. The commonly used solution is to update each one in turn. In the following, it is shown 
that the closed-form solutions of and S M can be derived if the other parameters are fixed. 
Since the Q(fjt) is assumed to be in Gaussian form, the optimal BEM coefficient estimate is 
equal to the mean, i.e., m /t . Similarly, it is shown below that (dj, bj) and (c, d) can be updated in 
pairs. Furthermore, we can derive the optimal x d with Viterbi algorithm when other parameters 
are fixed. Therefore, Ffiri^, S M , dj, bj, c, d, Sc d ) is minimized iteratively, starting with a certain 
initial value, and is guaranteed to converge p9|. 



1 ) Minimization w.r.t. x d : Gathering the terms in (31) that involve Sc d , we have 



¥, d = i [Tr {G"[x]G[x](m,m^ + %)) - 2Re{y ff G[x]m /l }" 

+ E lo §{ 2 6{x d {n)-x d {n))}. (32) 

n=l x d (n)£C d 

Instead of treating x d as continuous and taking derivatives of ([32]), which does not guarantee 
the resulted x d will fall on the pre-defined constellation points, optimal x^ is searched on the 



constellation to minimize ( |32| ) as follows. 

First, it should be noticed that if we only search on the constellation points, then x d (n) E C d 
and J2x d (n)ec d 5{xd{n) - x d {n)) = 1 for all n. Thus 

J>g{ S(x d (n)-x d (n))} = J2log{l} = 0. (33) 

n=l x d (n)eC d n=l 

Moreover, c, d > from the property of Gamma distribution, then the factor c/d can be excluded 
in the searching metric, and we have 

F id = Tr{G i/ [x]G[x](A M ifi^ + i: M )}-2Re{y H G[x]i5i4. (34) 
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It should be noticed that, the objective function given in (34) depends on x rf in a highly 
nonlinear way, making it difficult to find a solution for the optimal x^. In order to proceed, we 



perform the eigen-decomposition S M = J2jLi ^j^j^f > ant ^ we nave 



Tr \ G H [x]G[x]E„ \ = ^ A ; <'G" x G x 



Putting (35) into (34), we obtain 



¥ id = mfG^GtxK + £ A^f G ff [x]G[x]^ - 2Re{y"G h [x]m / J. 



(35) 



(36) 



Due to the equality G[x]/x = D[/x]x derived from (18) and (19), (36) can be written as 

M 

¥, d = x H D H [m M ]D[m M ]x + ^ A^D^^D^x - 2Re{y H D[A M ]x}. 

3=1 



(37) 



Then we adopt the Viterbi algorithm [20 1 to minimize (37). The frequency domain signal x is 
treated as a sequence of data and the correlation between data is determined by the ICI. Strictly 
speaking, for DSC, all the elements of D[m M ] and D[£ ■], j = 1, . . . , M are nonzero. But the 
entries close to the main diagonal are more prominent compared to those further away from 
the diagonal. This means that, the matrices Dfrh^] and D[£^] can be approximated as a banded 
matrix of bandwidth B = 2k + 1, as shown in Figure 12} This approximated banded structure 
of DSC matrix considers the most significant ICI from the left and right k closest symbols, 
and has been widely used in the literature [(6), p0| . As shown in Figure [2j the upper right 
and lower left nonzero entries of the matrix cause cyclic interference between the first and last 
subcarriers, making the problem different from the uni-directional convolutional code decoding, 
where Viterbi algorithm is commonly used. In order to avoid the complication, we set the first 
and last k symbols to be zero. 

With the banded structure of Dlrh^] and D[£ ■], the branch metric becomes 



n,n—K:n—2K ["-In— 2re:ra 



M 



j]]n,n— K:n— 2k, [^]n— 2«:n j\\n,n— K:n— 2k \p^\n— 2re:n) 

3=1 

-2Re{y*(n - «;)[D[m M ]] 

n,n— K:n— 2re[^]n— 2re:n} - (38) 



Since known pilot symbols are inserted between unknown data, slight modification to standard 
Viterbi algorithm is needed. For n E 3 P , the true pilot value will be used to calculate the branch 
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value, and the number of remaining paths will reduce by half because of merging. And for 
n E 3d, the algorithm will perform as normal Viterbi algorithm. 

2) Minimization w.r.t. m^, S M , cij, bj, c and d: The optimal values of other unknown param- 



eters are obtained by setting the first order derivative of (31) with respect to the corresponding 

parameter to zero. As shown in Appendix [Bj we obtain the following set of solutions: 

-l 





= (^diag | 


di 

A"" 


' b\t. 




= §E„G" 
a 


x]y 






= aj + 1 






h 






C 


= c + N 






d 


= d + y H y 


- 2Re 


{y H G 



+ --G"[x]G[x] 
a 



(39) 

(40) 
(41) 
(42) 
(43) 

m M mf + S„) } . (44) 

It is worth noting that, along with each update, when |[m^]j| 2 + [S^j gets close to zero, 
meaning both mean and variance of the corresponding [Lj are close to zero, then fij can be 
treated as null entry and pruned from further iteration. In practice, a threshold with the order of 
1CT 10 is used to compare with |[m M ]j| 2 + [S^j to determine which \ij is being pruned [21 1. 



D. Summary of the Iterative Algorithm 

We summarize the parameter updating procedure as follows 
Initialization: Choose initial values {a?, . . . , d° M }, {b®, . . . , b° M }, c°, d° and x°. 
Iterations: For the i th iteration 
Updating the parameters of GCE-BEM coefficients 



-at 1 



~ c i-l 



dia g { r^-, . . . , ^c-i i + ^G^r^tic 



i-V 



p-1 



111 



Updating the hyperparameters of GCE-BEM coefficients 



CLj + 1 
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Updating the estimate of data 

M 

minF, d = - 2Re {y H D[m M ]x} + x^D^m^Dfm^x + ^ A^D^jD^x, 

3=1 

where £* = J2f=i* using Viterbi algorithm. 

Updating the hyperparameters of noise 

~e = c + n 

d l = d + y H y - 2Re {y^G^m^} + Tr {g"[x' ; ]G[x 1 ] (K«) H + K) } 
Pruning 

If | [m /t ]j| 2 + < 1CT 10 =>■ fij = 0, remove dj from {d\, . . . , d\ { } and bj from {6^, . . . , 6^} 

and update the expression of G[x l ] by removing the j th column. 

End 

It is worth noting that the values of {di, . . . , and c remain the same for each iteration, thus 
they should only be updated in the first iteration in practice. 

V. Initialization of the Iterative Algorithm 

In order to start the iteration, one of the quantities we need is the initial data estimate x^. If an 
initial channel estimate can be obtained from pilots, then from ( fT9] ), the initial data estimate can 
be obtained. In Section |ni} GCE-BEM with large oversampling factor is used to represent the 
channel. Though this provides flexibility for selecting important bases in the iterative channel 
estimation and data detection, the large number of unknown coefficients corresponding to GCE- 
BEM also brings challenges in the initial value estimation, which is important to any itera- 
tive algorithm. Unfortunately, traditional compressed sensing methods like BP and orthogonal 
matching pursuit (OMP), whose performance highly depends on the accurate knowledge of noise 
variance, is not applicable in our case, since the variance of the composite noise is unknown. 

On the other hand, notice that the proposed iterative algorithm does not rely directly on 
the estimate of to start the iteration. The channel is needed only indirectly through initial 
data estimate. Thus, during initial data detection, we choose to expand the channel with CE- 
BEM, which corresponds to choosing the oversampling factor of GCE-BEM as V = 1. Then 
least squares (LS) algorithm is used to obtain the initial channel estimation. Though CE-BEM 
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is widely reported as relatively inaccurate among other BEMs, it represents the time-varying 
channel using a very small number of orthogonal bases. This is highly beneficial for a rough 
estimate at the initial stage where there is little knowledge of the channel and source signal at 
the receiver. 

According to (S, x = E^ + EpXp. Together with the fact that G [E p x p + E d x d ] = G[E p x p ] + 



G[E d Xd], (18) can be written as y = G[E x p ]/it + G[E d xJ/Lt + v. Collecting the output samples 
corresponding to pilot positions 3 P , the equation for initial channel estimation can be written as 

y p = Gp[E pX p]^ + Gp[E d x d ]^+Vp, (45) 

where G p [.] is constructed from the rows of G[.] corresponding to 3 P . The initial channel 
estimation can be obtained by treating the second term of ( [45] ) as noise and performing LS 
algorithm: 

fi = (Gf [E^xJGpfE^pD^Gf [E p xp]yp. (46) 



With the estimated channel fi, we rewrite ( 19) as y = D[/jt]E d x d + D[/i]E p Xp + v. Applying 
LS estimation again, we have 

x° = (E? D if [A]D[A]E d )- 1 EfD^[A](y - D[£]E pX p). (47) 

The obtained x° may not reside on the constellation map, thus quantization is performed on 
x° and the initial data detection is given as x° = Qant[x°]. Notice that in DSC, the ICI is not 



negligible, and Gp[E d xJ^t ^ 0, which decreases the accuracy of estimation in (46), and in turns 
affects the accuracy of initial data detection. This is the reason why an iterative algorithm is 
necessary. 

For other initial values {a®, . . . , d° M }, {b®, . . . , b° M }, c°, d° in the iterative algorithm, it is should 
be noticed that only the ratios dj/bj and c/d are required, thus we only need to specify the initial 



values of the ratios to start the iteration. From (28) and the property of Gamma distribution, 
dj/bj represents the mean value of <x,, which is the inverse variance of channel GCE-BEM 
coefficients. Since we have no information about their relative values, we can set them to be 



equal. That is, let = \jM for all j. Furthermore, from (29) and the property of Gamma 

distribution, c/d = E{/3} = E{l/cc^}. Therefore the initial value can be set as c°/d° = 1/^, 
where wl is an estimate of noise power 

w$ = y- G[E Xp + E d 5c° d ]fi. (48) 
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VI. Simulation Results and Discussions 

In this section, simulation results of dual-hop and three-hop cooperative OFDM systems are 
provided. In both systems, each OFDM symbol has 128 subcarriers and the length of CP is 8. 
Carrier frequency is f c = 2GHz and the sample interval is T s = 2fxs. The channel hf. iP (n,l) is 
generated according to zero-mean complex Gaussian distribution with autocorrelation of the I th 



tap given by E{h kjP (m,l)h kiP (n,l)} = o% tP (l)J (2nf ktP (!)(m-n)T a ) [|22j, where J (-) represents 
the zero-order Bessel function of the first kind, and cr| (I) is the power of the I th tap. Fourteen 
pilot clusters are used. The clusters are equal-spaced and interleaved with data subcarriers. In 
each cluster, one nonzero pilot is guarded by one zero pilot on each side. The nonzero pilots 
are generated as zero-mean complex Gaussian random variables with power three times that of 
data symbols. And the data is modulated with QPSK of unit power. The normalized channel 
mean-square error (MSE) and data detection bit error rate (BER) are plotted to demonstrate 
the performance. The MSE of the channel estimate at the i th iteration is defined as MSE* = 
||H l — H|| 2 /||H|| 2 , where H l is the channel matrix recovered from the GCE-BEM estimate at the 
i th iteration. The noise power at the relays and destination are set to be the same w\ = w\ , for 
all k and p. The signal-to-noise ratio (SNR) in the following figures is defined as SNR = of/ro 2 , 
(9j. The oversampling factor is chosen as V = 20 for GCE-BEM. Each point is obtained by 
averaging the results over 1,000 runs. 

For the dual-hop system, two relaying paths (K = 2) are considered. For both relaying paths 
(k = 1, 2), the maximal normalized Doppler shifts^] are set as 0.05 for the first hop and 0.15 for 
the second hop. In the simulation, for each specific channel, one randomly chosen tap has Doppler 
shift equals the maximum Doppler shift specified above. And for other taps, their Doppler shifts 
are uniformly drawn within the range from to the maximum Doppler shift. Both source-relay 
channels have 2 taps. One of the source-relay channels has tap positions uniformly drawn from 
{0, 1, 2} while the other has tap positions uniformly drawn from {0, 1, 2, 3,4}. And both relay- 
destination channels have 2 taps, with the tap positions uniformly drawn from {0,1,2,3} for 
one channel and from {0, 1,2} for the other. All the channels follow exponential power delay 
profiles normalized to unit power. For the Viterbi equalizer, k = 3 is chosen. 

Figure [3] and Figure [4] present the convergence performance of the proposed iterative algorithm 

'Normalized Doppler shift is defined as NfdT s with fd being the Doppler frequency. 
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in terms of MSE and BER, respectively. SNR is set at lOdB, 20dB and 30dB. It can be seen 
that the MSEs and BERs improve significantly in the first few iterations and converge to stable 
values before 10 iterations. 

Figure [5] and Figure [6] show the MSE and BER performance achieved by the proposed iterative 
algorithm versus SNRs. The results are taken after 10 iterations in order to guarantee convergence. 
In the figures, KLEM represents the performance of the EM algorithm with channel expanded on 
Karhuen-Loeve (KL) bases. This algorithm is an extension of the KLEM algorithm for single- 



hop case [23 1, and the detail is not included in this paper due to space limitation. The KLEM 
algorithm requires full information on channel tap positions, Doppler frequencies and power 
profile of each channel, together with noise statistics, thus serves as a reference for optimal 
performance here. And CRLB curve represents the Crame-Rao lower bound, which can be 
obtained from p3| by replacing the single-hop channel and noise power with the composite 
channel and composite noise power. Meanwhile, ideal case with full channel information at the 
receiver is also depicted as the performance bound in the BER figure. From Figure [5] and [6j it 
can be seen that, the proposed iterative algorithm successfully eliminates the interdependence 
between data detection and channel estimation, and exhibits significant performance improvement 
compared to the initial channel estimation MSE and data detection BER. Furthermore, though 
the proposed iterative algorithm does not have access to the relaying system structure (number of 
available links K) or any statistical information of the channel and noise, there are only minor 
performance gaps between the proposed method and the KLEM. Furthermore, the proposed 
algorithm and KLEM are very close to the ideal data detection in terms of BER performance. 

For three-hop relaying system, the maximal normalized Doppler shifts for the first and third 
hop are set as 0.05 while that of the second hop is set as 0.15. Two relaying paths are considered 
(K= 2). For the first relaying path, the number of channel taps in the three hops are {2, 3, 2}, 
respectively; while that for the second relaying path is {3, 2, 2}, respectively. The channel taps 
in each hop are consecutive. For the Viterbi equalizer, k = 4 is chosen. 

Figure [7] and Figure [8] show the MSE and BER performance achieved by the proposed iterative 
algorithm versus SNRs, with results taken after 10 iterations^} In both figures, performance 

2 As the convergence performance of a three-hop system is similar to that of dual-hop system, the convergence figures are not 
shown here. 
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curves of KLEM, which demands detailed information of relaying system structure, channel and 
noise statistics, are depicted as a reference for optimal channel estimation and data detection. 
From Figure [7J it is seen that, the proposed iterative algorithm greatly improve the performance 
from the initial channel estimation, indicating the ability of the proposed algorithm to cancel 
interference between unknown data and pilots through iterations. Furthermore, after convergence, 
only a small performance gap exists between the proposed algorithm and KLEM, which touches 
the CRLB at high SNRs. This exhibits the strong ability of our proposed algorithm in learning 
the statistics of both channel and noise. From Figure [8j the BER performance of our proposed 
method is also shown to improve significantly compared to the initial data detection and is 
very close that of KLEM algorithm. From these figures, it can be concluded that, though the 
system model in a three-hop system is more complicated than that of dual-hop case, the proposed 
algorithm continues to present good performance in terms of both channel estimation MSE and 
data detection BER and demonstrate robustness in a variety of OFDM relaying systems. 

VII. Conclusions 

In this paper, channel estimation and data detection for multihop OFDM relaying system 
under high mobility has been investigated with focus on unknown channel orders and Doppler 
frequencies. By exploring the matrix structure of channels in different hops, we first simplified 
the multihop multilink channel matrix into a composite channel matrix. Then the composite 
channel was represented using GCE-BEM with a large oversampling factor so that sparsity on 
the delay-Doppler domain was introduced. Sparsity enhancing Gaussian priors with Gamma 
hyperpriors were adopted to enable the identification of nonzero entries. A pilot-aided iterative 
algorithm was developed under variational inference (VI) framework, using only limited number 
of pilot subcarriers in one OFDM symbol. The proposed algorithm iteratively estimates the 
channel, recovers the unknown data using Viterbi algorithm and learns the channel and noise 
statistical information. Simulation results showed that, even without any specific information on 
system structure, channel tap positions, channel lengths, Doppler shifts and noise powers, the 
proposed algorithms exhibited performance very close to that of an optimal channel estimation 
and data detection algorithm, which requires all of the above information. 
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Appendix A 
Calculation of Free Energy Function 



Taking logarithm on (27) and substituting the result to the first term of (25), we obtain 
Q(v)\ogQ(n)d» = -MlogTr-logdet^-mfS^m^ 

+2Re {E^jS; 1 !^} - Tr {s^E^}' 
= -Mlogvr-logdet^)-!?!^ 1 ^ 



+2mj£; 1 m M - Tr {s^m^m* + 
—M log Tr - log det(S M ) - M. 



(49) 



On the other hand, from the form of Q{otj) given in (28), and notice that Q(ot) = YljLi Q( a j)> 
we can compute the second term of ( f25| ) as 



M 



Q(ol) log Q(ol) da. = Icij logbj — logr(aj) + (a^ — l)E{loga:.,} — bjE{ctj} 



M 



[«jiog&j -logr^-; 

3=1 



+(a j -l)[^(a j )-logb j )-b j (a j /b 



M 



^[io g 6 i -io g r( 

3=1 



(50) 



5 



where the digamma function \l/ is defined by \Ka) = ^logT(a). Furthermore, since Q((3) is 

aa 

in the same form as Q{aj), following a similar derivation as above, it can be easily shown that 
the third term of ( |25| ) is 

Q{/3) log Q{/3)df3 = logd-logr(c) + (c- l)^(c) -5. (51) 



Based on the Dirac delta function in (30), the forth term of (25) is given by 
C?(x d )logQ(x d )dx d = log Q(5td) = log<5(x d -x d ) = 0. 



(52) 



Furthermore, from (20), we have 



logp(/i|a) = -Mlog7r - logdet(A r )-^i H An 



-Mlogvr + log - Tr {A/x/x H } . 



(53) 
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And the fifth term of (25) can be computed as 
Q(fj,)Q(a) log p(fj,\ oc) duda. 



M 

-M logn + J2 E «{!og «il " Tr{E a {diag{a}}E /1 {/x/x H }} 



M 



Mlogvr + (*(2 3 -) - logfy) - Tri diag 



Q>1 a M 

hi ' b M 



m M m* + EJ k54) 



From (21 ), we can compute the logarithm of p(a) = n^iP( a i)' an d the sijc//i term of (25) 

can be written as 

» a/ 

/ Q(a) logp(a)da = \^ [ a j l°g^i — l°gr(a 3 -) + (aj — l)E a {loga;j} — 6jE{a 3 -}] 



3=1 
M 



J2 h- logfe,- - logr(a,) + (a, - l)(*(aj) - log 6,) - 6^] (55) 



Similarly, as and p(/3) are in the same form as Q(oij) and p{o.j), we can easily show that 
the seventh term of ( f25| ) is 



Q(P)logp(P)dp = c\ogd-\ogT(c) + (c - 1) (#(c) -logd) - 



(56) 



From (22), we can obtain the logarithm of p(x d ). Together with (30), the eighth term of (25) 
can be derived as 

N d 



(57) 



/ Q(x d ) logp(x d )dx d = / 5(x d - x d ) ^log { ^ 5(x d (n) - x d (n))}dx d - log{A^^ d } 

^ Xd ^ Xd n=l x d (n)eC d 

^ r i 
= Xl lo s{ Yl S{x d {n)-x d {n))j - N d log{M d }. 

n=l x d (n)eC d 

Finally, taking the logarithm of (26), we have the ninth term of (25) given by 
/ Q(fj.)Q(P)Q(xd) logp(y|/z, (3, x d )dfidpdx d 

= -iVIogvr + iVE^log/3} - E^/3} [y H y - 2Re{y^G[E p x p + E d x d ]E /i {/x}} 
+Tr{G H [E p x p + E dX(i ]G[E p x p + E d 5c d ]E^{/x/x H }} 



(58) 



-iVIogvr + iV (*(c) - log(d)) - [y"y - 2Re |y H G[E p x p + E d x d ]rIi M } 
+Tr {G H [E p + x p + E d x d ]G[E p x p + E d St d ] (m^mjf + E, 
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Derivation of Updating Functions 



Updating (m M , £ M ) given dj, bj, c, d and x d 



Focusing on the terms in F related to £„, we have 



d¥ 



° ' logdet(S M )+IY{diag{[5,...,e^]}S M } + §[G^[x]G[x]S /1 

bi b M d 



as 



-S- 1 + diae 



a i 



<2m 



+ --G*[x]G[x] 
d 



Setting (59) to zero leads to (39). On the other hand 



d¥ 
9m„ 



<9 



dm. 



Tr < diag 



ClM 
t>M. 



— 2Re |y H G[x]m M } 



+G H [x]G[x]m At mf 

9 



diag 



a i 



diag 



b~M J 



+ --G"[x]G[x] 

a 



m, 



"m?G"[x] 



d 



+ --G«[x]G[x] 
d 



m M --,G-[x]y 
d 



Setting (60) to zero leads to (40) 



Updating (dj, b j) given m M , S M , c, d and x<i 



Gathering the terms in F that are related to dj, we have 

d¥ d 
ddj 



dcij 



Tr < diag 



bi 



+ 



log &j + (dj — 1) ^(oj) — log&j — aj — logT(a 



(aj - 1) *(aj) - \ogbj - bjdj/bj 



m 



+ logL + (dj - l)^'(aj 



+*(Sj) - log&j - 1 - *(5j) - (a,- - + ^ 



(dj - aj - l)^'(dj) - 1 + — 
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Similarly, 



d 



dbj dbj 



Tr < diag 



M J 



m u mf + £J >> + \ogb 



+ 



aj logbj — (dj — 1) log Oj — (cij — 1)(— logo.,) — bj<ij/bj 



1 aj aj — 1 aj — 1 bjdj 



aj + 1 aj 



in 



(62) 



Setting both (61) and (62) to zero and solving the simultaneous equations, we obtain (41) and 



(42). 



Updating (c, d) given m^, £„, a,-, and x d 



Following the procedure in updating other parameters, we compute 

dc 



8 I rd 
^-l (5-1) *(c) - c - log r(c) - (c - l)^(c) + ^ - iWg) 
«c I 



and 

dd 



~ 2Re {y H G[x]m M } + Tr {g* [x]G[x] (rn^mj + S M 

(c - l)tf'(c) + *(c) - 1 - *(c) - (c - l)*'(c) + t - JVtt'(c) 

d 

+l[y H y-2Re{y^G[x]m M } +Tr{G H [x]G[x](m M mf + 
(g - c - jV)*'(c) - 1 + - [d + y H y - 2Re {y H G[x]m M } 
+Tr{G^[x]G[x](m /i mJ + S, 



J c logd+ -j+ Nlogd+ - y H y - 2Re y H G[x]rii 
dd 1 - d d l ^ J 

+ Tr{G"[x]G[x](m>f + %»)}]} 



(63) 



■ T [d + y H y- 2Re{y*G[x]m / J + -X- + Tr{G" [x]G[x](m^mJ + E„)}j64) 

er a 



Setting both (63) and (64) to zero and solving the two simultaneous equations, we obtain (43) 



and (44). 
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Fig. 1. Multihop Cooperative Communication System 




Fig. 2. Banded Matrix Structure Approximation for D[m M ] and D[£ •] 
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Fig. 3. Convergence of Channel Estimation for Dualhop OFDM System 
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Fig. 4. Convergence of Data Detection for Dualhop OFDM System 
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Fig. 5. Performance of Channel Estimation for Dualhop OFDM System 
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Fig. 6. Performance of Data Detection for Dualhop OFDM System 
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Fig. 7. Performance of Channel Estimation for Three-hop OFDM System 
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Fig. 8. Performance of Data Detection for Three-hop OFDM System 
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